Replicas should be able to heal if replication is not initialised properly #10943

Merged
merged 2 commits into from
Aug 9, 2022

Conversation

GuptaManan100
Member

@GuptaManan100 GuptaManan100 commented Aug 4, 2022

Description

This PR addresses the issue raised in #10955. Once replication is stuck as described in #10955, it can be repaired by calling RESET REPLICA and then trying START REPLICA again.

The proposed solution is to always run RESET REPLICA in setReplicationSourceLocked whenever the replication source changes (the primary's host or port), or when forceStartReplication is specified.

This lets both VTOrc and the replication manager address this issue, since they both call setReplicationSourceLocked with forceStartReplication.
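
As a rough illustration of the decision described above (this is not the code from this PR), here is a minimal Go sketch that returns the statements a replica might run. The names `replicaState` and `planSetSource` are hypothetical. The sketch also assumes MySQL 8.0.23+ statement syntax, GTID auto-positioning, and that replication should be running once the source has been set.

```go
package main

import "fmt"

// replicaState is a hypothetical stand-in for the replica's current
// replication configuration. It is not a Vitess type.
type replicaState struct {
	SourceHost string
	SourcePort int
}

// planSetSource returns, in order, the SQL statements a replica would run to
// point at host:port. It resets replication whenever the source changes or
// forceStart is set, mirroring the rule in the description above.
// Illustration only: the real setReplicationSourceLocked may differ.
func planSetSource(cur replicaState, host string, port int, forceStart bool) []string {
	sourceChanged := cur.SourceHost != host || cur.SourcePort != port
	if !sourceChanged && !forceStart {
		// Nothing to repair and nothing to change.
		return nil
	}
	return []string{
		// RESET REPLICA requires the replication threads to be stopped.
		"STOP REPLICA",
		"RESET REPLICA",
		// Assumption: GTID-based replication with auto-positioning.
		fmt.Sprintf("CHANGE REPLICATION SOURCE TO SOURCE_HOST = '%s', SOURCE_PORT = %d, SOURCE_AUTO_POSITION = 1", host, port),
		"START REPLICA",
	}
}
```

With an unchanged source and `forceStart` set to `true`, the sketch still emits the stop, reset, and start sequence. That is the property that would let a caller such as VTOrc or the replication manager recover a replica whose replication was never initialised properly.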

Related Issue(s)

Checklist

  • "Backport me!" label has been added if this change should be backported
  • Tests were added or are not required
  • Documentation was added or is not required

Deployment Notes

@vitess-bot
Contributor

vitess-bot bot commented Aug 4, 2022

Review Checklist

Hello reviewers! 👋 Please follow this checklist when reviewing this Pull Request.

General

  • Ensure that the Pull Request has a descriptive title.
  • If this is a change that users need to know about, please apply the release notes (needs details) label so that merging is blocked unless the summary release notes document is included.
  • If a new flag is being introduced, review whether it is really needed. The flag names should be clear and intuitive (as far as possible), and the flag's help should be descriptive.
  • If a workflow is added or modified, each item in Jobs should be named in order to mark it as required. If the workflow should be required, the GitHub Admin should be notified.

Bug fixes

  • There should be at least one unit or end-to-end test.
  • The Pull Request description should either include a link to an issue that describes the bug OR an actual description of the bug and how to reproduce, along with a description of the fix.

Non-trivial changes

  • There should be some code comments as to why things are implemented the way they are.

New/Existing features

  • Should be documented, either by modifying the existing documentation or creating new documentation.
  • New features should have a link to a feature request issue or an RFC that documents the use cases, corner cases and test cases.

Backward compatibility

  • Protobuf changes should be wire-compatible.
  • Changes to _vt tables and RPCs need to be backward compatible.
  • vtctl command output order should be stable and awk-able.

@deepthi
Member

deepthi commented Aug 6, 2022

Can you create an issue that describes the problem? For this particular problem, I think it is important that we document it.

@GuptaManan100 GuptaManan100 added the Type: Enhancement, Component: Cluster management, and Component: VTorc labels Aug 8, 2022
@GuptaManan100 GuptaManan100 marked this pull request as ready for review August 8, 2022 09:20
@GuptaManan100 GuptaManan100 changed the title from "Shard replica issue" to "Replicas should be able to heal if replication is not initialised properly" Aug 8, 2022
Contributor

@mattlord mattlord left a comment

LGTM! ❤️

Member

@deepthi deepthi left a comment

Nice work. LGTM

@deepthi deepthi merged commit 7f25195 into vitessio:main Aug 9, 2022
@deepthi deepthi deleted the shard-replica-issue branch August 9, 2022 01:09
systay pushed a commit to planetscale/vitess that referenced this pull request Aug 19, 2022
…tialised properly vitessio#10943 (vitessio#935)

* Replicas should be able to heal if replication is not initialised properly (vitessio#10943)

* feat: add code to also reset replication parameters in setReplicationSourceLocked when required

Signed-off-by: Manan Gupta <[email protected]>

* test: fix tests to reflect the change

Signed-off-by: Manan Gupta <[email protected]>

* feat: fix vtworker tests

Signed-off-by: Manan Gupta <[email protected]>
timvaillancourt pushed a commit to slackhq/vitess that referenced this pull request Aug 16, 2023
…perly (vitessio#10943)

* feat: add code to also reset replication parameters in setReplicationSourceLocked when required

Signed-off-by: Manan Gupta <[email protected]>

* test: fix tests to reflect the change

Signed-off-by: Manan Gupta <[email protected]>
Labels
Component: Cluster management, Component: VTorc, Type: Enhancement
Development

Successfully merging this pull request may close these issues.

Bug Report: replicas do not self heal when replication initialization fails
3 participants